Auditory Time-Frequency Masking: Psychoacoustical Data and Application to Audio Representations
Authors
Abstract
In this paper, the results of psychoacoustical experiments on auditory time-frequency (TF) masking using stimuli (masker and target) with maximal concentration in the TF plane are presented. The target was shifted either along the time axis, the frequency axis, or both relative to the masker. The results show that a simple superposition of spectral and temporal masking functions does not provide an accurate representation of the measured TF masking function. This confirms the inaccuracy of simple models of TF masking currently implemented in some perceptual audio codecs. In the context of audio signal processing, the present results constitute a crucial basis for the prediction of auditory masking in the TF representations of sounds. An algorithm that removes the inaudible components in the wavelet transform of a sound while causing no audible difference to the original sound after re-synthesis is proposed. Preliminary results are promising, although further development is required.
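As a rough illustration of what a "simple superposition" model and the pruning of inaudible components could look like, here is a minimal sketch. It is not the authors' model or algorithm. The linear decay shapes, the slopes, the 50 dB peak, and all function names are hypothetical placeholders chosen for the example, and "superposition" is read here as adding the attenuations of the two one-dimensional masking functions in dB. The abstract reports that a model of this kind does not match the measured TF masking data.

```python
# Illustrative sketch only: every slope, level, and name below is a
# hypothetical placeholder, not a value or method taken from the paper.

def spectral_masking_db(delta_f_erb, peak_db=50.0, slope_db_per_erb=12.0):
    """Amount of masking (dB) for a target shifted in frequency only."""
    return max(0.0, peak_db - slope_db_per_erb * abs(delta_f_erb))


def temporal_masking_db(delta_t_ms, peak_db=50.0, slope_db_per_ms=1.5):
    """Amount of forward masking (dB) for a target delayed in time only."""
    if delta_t_ms < 0:
        return 0.0  # backward masking is ignored in this sketch
    return max(0.0, peak_db - slope_db_per_ms * delta_t_ms)


def superposed_masking_db(delta_t_ms, delta_f_erb, peak_db=50.0):
    """Assumed 'simple superposition': add the two 1-D attenuations in dB."""
    loss_f = peak_db - spectral_masking_db(delta_f_erb, peak_db)
    loss_t = peak_db - temporal_masking_db(delta_t_ms, peak_db)
    return max(0.0, peak_db - loss_f - loss_t)


def prune_inaudible(components):
    """Keep only components whose level exceeds the predicted masking.

    Each component is (delta_t_ms, delta_f_erb, sensation_level_db),
    with shifts given relative to a single masker.
    """
    return [
        (dt, df, level)
        for dt, df, level in components
        if level > superposed_masking_db(dt, df)
    ]


if __name__ == "__main__":
    components = [(0, 0, 60.0), (10, 1, 20.0), (10, 1, 30.0)]
    print(prune_inaudible(components))
```

A measured two-dimensional masking function would replace `superposed_masking_db` in such a scheme; the pruning step itself would stay the same.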
Similar resources
A Gammatone-based Psychoacoustical Modeling Approach for Speech and Audio Coding
We propose a new approach for modeling auditory masking based on gammatone filters for application areas including speech/audio coding and audio watermarking. Besides the use of gammatone filters, this model differs from existing audio coding psychoacoustical models (e.g., the ones used in MPEG), in taking into account the contribution of a range of filters in computing the distortion, rather t...
A computationally efficient cochlear filter bank for perceptual audio coding
Many applications in auditory modeling require analysis filters that approximate the frequency selectivity given by psychophysical data, e.g. from masking experiments using narrow-band maskers. This frequency selectivity is largely determined by the spectral decomposition process inside the human cochlea. Currently used spectral decomposition schemes for masking modeling in audio coding general...
Auditory-inspired sparse representation of audio signals
This article deals with the generation of auditory-inspired spectro-temporal features aimed at audio coding. To do so, we first generate sparse audio representations we call spikegrams, using projections on gammatone/gammachirp kernels that generate neural spikes. Unlike Fourier-based representations, these representations are powerful at identifying auditory events, such as onsets, offsets, tr...
A Perceptual Model for Sinusoidal Audio Coding Based on Spectral Integration
Psychoacoustical models have been used extensively within audio coding applications over the past decades. Recently, parametric coding techniques have been applied to general audio and this has created the need for a psychoacoustical model that is specifically suited for sinusoidal modelling of audio signals. In this paper, we present a new perceptual model that predicts masked thresholds for s...
A psychoacoustical model of the auditory periphery as front end for ASR
The application of a psychoacoustical model of the auditory periphery in the field of automatic speech recognition (ASR) is presented. The model was developed to quantitatively predict human performance in typical spectral and temporal masking experiments. Speaker-independent, isolated-digit recognition experiments in different types of noise were carried out to evaluate the robustness of the a...